# Replication Package for "Measuring How Much Judges Matter for Case Outcomes"

This repository contains the data and R code necessary to replicate the results presented in the article:

> **Copus, Ryan and Ryan Hübert**. Forthcoming. "Measuring How Much Judges Matter for Case Outcomes." _Journal of Law and Courts_.

---

## Repository Structure

```
├── Code/              # R scripts for analysis and figure/table generation
├── Data/              # Raw and/or processed data files
├── Manuscript/        # Manuscript and references
├── Outputs/           # Generated figures and tables
├── README.md          # This file
```

### Handling the Working Directory

The scripts in this repository use the `here` package to try to quietly and intelligently set the proper working directory. However, setting working directories is known to cause problems in R (and especially RStudio), so a user may need to manually set their working directory.
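
If the automatic detection fails, one fallback is to set the working directory by hand and confirm that the expected top-level folders are visible from it. The following is a minimal base-R sketch, not part of the replication package: `check_repo_root()` is a hypothetical helper, and the path in the usage comment is a placeholder.

```r
# Hypothetical helper (not part of the replication package): confirm that a
# directory looks like the repository root by checking for the top-level
# folders listed under "Repository Structure".
check_repo_root <- function(path = getwd()) {
  expected <- c("Code", "Data", "Manuscript", "Outputs")
  absent <- expected[!dir.exists(file.path(path, expected))]
  if (length(absent) > 0) {
    warning("This may not be the repository root. Missing folders: ",
            paste(absent, collapse = ", "))
    return(FALSE)
  }
  TRUE
}

# Usage (placeholder path; replace with the location of your local copy):
# setwd("/path/to/replication-package")
# check_repo_root()
```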

---

## Software

The last successful execution of the code in this repository was in March 2025, using RStudio on an Apple MacBook Pro (November 2021 M1 Max version with 64 GB RAM). It took approximately 2.5 hours to run, from start to finish.

The software versions used were as follows:

- `RStudio` version 2024.12.1+563 "Kousa Dogwood" Release (27771613951643d8987af2b2fb0c752081a3a853, 2025-02-02) for macOS
- `R` version 4.4.3 (2025-02-28) "Trophy Case"
- `java` version 1.8.0_331

The following R packages were used:

- `ade4` version 1.7-22
- `here` version 1.0.1
- `h2o` version 3.44.0.3
- `parallel` version 4.4.3
- `pROC` version 1.18.5
- `tidyverse` version 2.0.0
- `wfe` version 1.9.1
- `xgboost` version 1.7.8.1

**Please ensure the required software (and R packages) are installed prior to replicating the results from this paper.**
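
To compare a local installation against the versions listed above, something like the following base-R sketch may help. `check_versions()` is a hypothetical helper written for this README, not a function shipped with the repository. A mismatch does not necessarily mean the code will fail, only that results were not verified under that version.

```r
# Versions listed in this README (names = package, values = version used).
required <- c(
  ade4 = "1.7-22", here = "1.0.1", h2o = "3.44.0.3", parallel = "4.4.3",
  pROC = "1.18.5", tidyverse = "2.0.0", wfe = "1.9.1", xgboost = "1.7.8.1"
)

# Hypothetical helper: for each package, report whether it is installed and
# whether the installed version equals the listed one.
check_versions <- function(required) {
  rows <- lapply(names(required), function(pkg) {
    have <- nzchar(system.file(package = pkg))  # TRUE if the package is installed
    found <- if (have) utils::packageVersion(pkg) else NULL
    data.frame(
      package   = pkg,
      listed    = required[[pkg]],
      installed = if (have) as.character(found) else NA_character_,
      # Compare as version objects so "1.7-22" and "1.7.22" are treated alike.
      matches   = have && found == package_version(required[[pkg]])
    )
  })
  do.call(rbind, rows)
}

print(check_versions(required))
```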

### Special Notes about `h2o`

The requirements for using the `h2o` package are described here: <https://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html#requirements>.

Installation instructions for the `h2o` package are available at the following website: <https://docs.h2o.ai/h2o/latest-stable/h2o-docs/getting-started/r-users.html>.

There is a useful "cheatsheet" for the `h2o` package, which is available here: <https://github.com/rstudio/cheatsheets/blob/main/h2o.pdf>.

The XGBoost machine learning algorithm is used in two of the R scripts in this repository. However, as of March 2025, the `h2o` package's implementation of this algorithm (via the function `h2o.xgboost()`) is not available on Windows or on some Macs with Apple silicon chips. The code in the two scripts that use the algorithm has been structured to detect whether the `h2o.xgboost()` function is available and, if not, to use R's `xgboost` package instead. This means that results may vary slightly across machines.
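
To see in advance which implementation a given machine is likely to use, a check along the following lines is one option. This is a sketch and may differ from the detection logic in the scripts themselves: `choose_xgb_backend()` is a hypothetical helper, and it relies on `h2o.init()` and `h2o.xgboost.available()` from the `h2o` package, which need a working Java installation.

```r
# Hypothetical helper (the scripts' own detection logic may differ): report
# which XGBoost implementation appears usable on this machine.
choose_xgb_backend <- function() {
  h2o_ok <- nzchar(system.file(package = "h2o")) &&
    tryCatch({
      h2o::h2o.init()                       # start or connect to a local H2O cluster
      isTRUE(h2o::h2o.xgboost.available())  # FALSE where the native backend is missing
    }, error = function(e) FALSE)
  if (h2o_ok) return("h2o")
  if (nzchar(system.file(package = "xgboost"))) return("xgboost")
  "none"
}

backend <- choose_xgb_backend()
message("XGBoost backend: ", backend)
```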

---

## Replication Instructions

First, download the full repository to your computer.

There are two ways to replicate the results. First, you can replicate the entire analysis from start to finish by running the R script titled `0.FullReplication.R`. Second, you can replicate only parts of the analysis by running individual scripts from the list below. Note, however, that the scripts should be run in sequence.

- `Code/1.CalculateDeviance.R`: This script creates a model of reversal that does not include panel predictors. This model captures variation in reversal that is due to case predictors only. The model's predictions are then residualized out of the outcome variable (reversal) to produce a new variable we call `deviance` (indicating the case's deviance from the court-level expectation based only on case characteristics). The model's predictions (called `expected`) and each case's `deviance` score are saved in a dataframe called `pf` in `Data/deviance.rdata`.

- `Code/2.GeneratePRQs.R`: This script creates a model of deviance using panel characteristics (e.g., assigned judges and their demographic characteristics). This model captures the distinctive effect that assigned panel judges have on each case's outcome. Then, using this model, we simulate how approximately 1,000 "counterfactual" panels and the actual assigned panel would have decided the case. We are left with case-level predictions for this set of panels and can thus calculate the actual assigned panel's quantile in the prediction distribution. This is each case's PRQ, which is called `panel.pred` and is saved in a dataframe called `pf` in `Data/perc.rdata`.

- `Code/3.CheckRandomization.R`: This script verifies that cases are randomly assigned to panels in our dataset by testing whether case characteristics predict whether a panel is a majority-Republican-appointed or a majority-Democratic-appointed panel. If cases are truly randomly assigned, panel characteristics should be uncorrelated with case characteristics, and we should be unable to predict panel partisanship from case characteristics. Indeed, we cannot, as evidenced by the predicted probabilities stored in `Data/randomization.rdata` and used to plot an ROC curve in `Code/4.FiguresTables.R`.

- `Code/4.FiguresTables.R`: This script creates tables and figures that are saved in the `Outputs` folder, and which are used in the manuscript.
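
The logic of the first two scripts can be illustrated on simulated data. The sketch below is not the authors' code: it substitutes `glm()` and `lm()` for the XGBoost models, all variable names other than `pf`, `expected`, and `deviance` are hypothetical, and it assumes that `deviance` is the outcome minus the prediction and that the quantile is the share of simulated predictions at or below the assigned panel's prediction.

```r
set.seed(1)
n <- 500
pf <- data.frame(
  case_x  = rnorm(n),  # simulated case characteristic (hypothetical)
  panel_z = rnorm(n)   # simulated panel characteristic (hypothetical)
)
pf$reversed <- rbinom(n, 1, plogis(-1 + 0.8 * pf$case_x + 0.5 * pf$panel_z))

# Step 1: model reversal from case predictors only, then residualize.
case_model  <- glm(reversed ~ case_x, family = binomial, data = pf)
pf$expected <- fitted(case_model)          # expectation from case characteristics
pf$deviance <- pf$reversed - pf$expected   # assumed definition: outcome minus prediction

# Step 2: model deviance from panel characteristics.
panel_model <- lm(deviance ~ panel_z, data = pf)

# Step 3: for one case, compare the assigned panel's prediction with the
# predictions for 1,000 simulated counterfactual panels.
counterfactual_z <- rnorm(1000)
pred_cf     <- predict(panel_model, data.frame(panel_z = counterfactual_z))
pred_actual <- predict(panel_model, data.frame(panel_z = pf$panel_z[1]))

# Quantile of the assigned panel within the full prediction distribution.
prq <- mean(c(pred_cf, pred_actual) <= pred_actual)
prq
```

In this toy version the quantile lies in (0, 1], with values near 1 indicating that the assigned panel is predicted to push the case further toward reversal than most alternative panels would.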

You can also choose to compile the manuscript, which is located in the `Manuscript` folder, using LaTeX.

There are two additional R scripts, `_config.R` and `_utils.R`, which take configuration steps and define useful functions. A user should not need to modify either script. **Please note that `_config.R` will automatically install any required packages.**
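
For a partial replication, the requirement that the numbered scripts run in sequence can be sketched as follows. `run_in_sequence()` is a hypothetical helper, not part of the repository, and the usage line assumes the working directory is the repository root.

```r
# The numbered scripts in the order given above.
scripts <- file.path("Code", c(
  "1.CalculateDeviance.R",
  "2.GeneratePRQs.R",
  "3.CheckRandomization.R",
  "4.FiguresTables.R"
))

# Hypothetical helper: source each script in order, stopping at the first
# missing file so that later steps do not run without their inputs.
run_in_sequence <- function(scripts) {
  for (s in scripts) {
    if (!file.exists(s)) stop("Script not found: ", s)
    message("Running ", s)
    source(s)
  }
  invisible(scripts)
}

# Usage (from the repository root), for example the first two steps only:
# run_in_sequence(scripts[1:2])
```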

---

## Data Availability

The data in this repository can be used according to the terms and conditions of the publishing journal. 

Please note that all judge identifiers have been anonymized. 

---

## Citation

If you use this code or data, please cite the original article:

> **Copus, Ryan and Ryan Hübert**. Forthcoming. "Measuring How Much Judges Matter for Case Outcomes." _Journal of Law and Courts_.

You may also cite this replication package as:

> **Copus, Ryan and Ryan Hübert**. 2025. _Replication Package for "Measuring How Much Judges Matter for Case Outcomes"_. Harvard Dataverse. 

---

## Contact

For questions or clarifications, please contact:

- **Ryan Copus**, Associate Professor, University of Missouri--Kansas City School of Law, [copusr@umkc.edu](mailto:copusr@umkc.edu), <https://law.umkc.edu/profiles/faculty-directory/copus-ryan.html>. ORCID [0000-0001-5242-9480](https://orcid.org/0000-0001-5242-9480). 
- **Ryan Hübert**, Associate Professor, London School of Economics and Political Science, [r.hubert@lse.ac.uk](mailto:r.hubert@lse.ac.uk), <https://ryanhubert.com/>. ORCID [0000-0003-1556-4127](https://orcid.org/0000-0003-1556-4127). GitHub profile: <https://github.com/ryanhubert>. 